JP-3653: Nsclean speedup #8547

t-brandt · 2024-06-11T02:19:39Z

This PR improves the run time of the NSClean algorithm (not counting the time to construct a pixel mask in NIRSpec) by a factor between 10 and 20. This is achieved by representing a diagonal weight matrix as a vector and using vector multiplication and broadcasting to achieve equivalent results without extra multiplications by zero and additions of zero. The changes are entirely within NSClean.fit(). Nothing outside of this routine appears any different, and I have verified that the results of the calculation are the same.

Checklist for PR authors (skip items if you don't have permissions or they are not applicable)

added entry in CHANGES.rst within the relevant release section
updated or added relevant tests
updated relevant documentation
added relevant milestone
added relevant label(s)
ran regression tests, post a link to the Jenkins job below.
How to run regression tests on a PR
All comments are resolved
Make sure the JIRA ticket is resolved properly

… code

codecov · 2024-06-11T02:51:09Z

Codecov Report

Attention: Patch coverage is 0% with 3 lines in your changes missing coverage. Please review.

Project coverage is 58.52%. Comparing base (b7e0b10) to head (52ebaf5).
Report is 337 commits behind head on master.

Files with missing lines	Patch %	Lines
jwst/nsclean/lib.py	0.00%	3 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##           master    #8547      +/-   ##
==========================================
+ Coverage   58.02%   58.52%   +0.50%     
==========================================
  Files         388      388              
  Lines       38977    38958      -19     
==========================================
+ Hits        22617    22802     +185     
+ Misses      16360    16156     -204

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

drlaw1558 · 2024-06-11T02:58:01Z

@t-brandt Looks good to me; can you please add a change log entry?
@hbushouse Can you please kick off a regression test run for this?

t-brandt · 2024-06-11T14:10:45Z

Ok, I updated CHANGES.rst.

drlaw1558 · 2024-06-11T15:13:13Z

Thanks- minor comment that the number of dashes underneath the title 'nsclean' should be the same number of characters as the step name. I'm not honestly sure why, but @hbushouse has frequently had to remind me about it.

hbushouse · 2024-06-11T15:42:59Z

Thanks- minor comment that the number of dashes underneath the title 'nsclean' should be the same number of characters as the step name. I'm not honestly sure why, but @hbushouse has frequently had to remind me about it.

The number of underline characters needs to be at least as long as the title above it, but can be longer (without doing harm). It just can't be shorter or the sphinx parser will barf.

t-brandt · 2024-06-11T15:49:03Z

Ok, I adjusted the number of dashes.

braingram · 2024-06-11T16:05:44Z

I tried to run regtests for this PR:
https://plwishmaster.stsci.edu:8081/blue/organizations/jenkins/RT%2FJWST-Developers-Pull-Requests/detail/JWST-Developers-Pull-Requests/1513/tests
Unfortunately there are many errors that look unrelated (mostly in fgs related tests) but I'm not sure what's causing them. Here's a snippet from the traceback:

        # Make the query
        params = {'pattern': pattern}
        with requests.get(search_url, params=params, headers=headers) as r:
>           url_paths = r.json()['files']
E           KeyError: 'files'
/data1/jenkins/workspace/RT/JWST-Developers-Pull-Requests/clone/jwst/regtest/regtestdata.py:500: KeyError

melanieclarke · 2024-06-11T17:13:47Z

It looks like these changes are only in the NSClean class. Is it possible to make similar changes in the NSCleanSubarray class too? I'm wondering if this would enable support for the ALLSLITS subarray - right now, it uses so much memory that it crashes the kernel if attempted, so support for this subarray is explicitly turned off.

t-brandt · 2024-06-11T18:29:07Z

@melanieclarke The NSCleanSubarray already does something close to the changes I made. Unfortunately, I think that the reason it fails is

jwst/jwst/nsclean/lib.py

Line 471 in 41f24b7

B = np.array(np.exp(2*np.pi*1J*m*k/self.n)/m.shape[0], dtype=np.complex64)

This results in a matrix of size npix x npix (see

jwst/jwst/nsclean/lib.py

Line 465 in 41f24b7

m = (_y*(self.nx+self.nloh) + _x)[self.mask].reshape((-1,1))

)
If a subarray is 1000x100 pixels, for example, then Line 471 referenced above produces a matrix with 1e10 elements, which may not fit in RAM.

This problem does not affect the full array version of NSClean because that operates row-by-row, so that there are never more than 2048 pixels to be fit simultaneously, meaning that the equivalent matrix is at most 2048^2:

jwst/jwst/nsclean/lib.py

Line 163 in 41f24b7

    
           B = np.array(np.exp(2 * np.pi * 1J * m * k / self.nx) / m.shape[0], dtype=np.complex64)

This array size/memory issue in NSCleanSubarray can be fixed, but not quite so simply as with the pull request here. If it is a priority for the mission I can spend a bit more time to propose a fix.

drlaw1558 · 2024-06-11T18:34:36Z

Maybe I'm missing something, but can we simply swap subarray mode to run row-by-row as well?

t-brandt · 2024-06-11T18:41:41Z

@drlaw1558 Yes, we can do that. The best thing is probably to do something intermediate: to process as many rows simultaneously as we think we can profitably fit the Fourier modes. Assuming I am interpreting Bernie's code right, line

jwst/jwst/nsclean/lib.py

Line 336 in 41f24b7

def __init__(self, data, mask, fc=(1061, 1211, 49943, 49957),

defines the cutoff and cuton frequencies as ~50 kHz and ~1 kHz, which would correspond to 2 pixels and 100 pixels. So that would imply that doing a few rows at a time would be best, I think, with the number depending on subarray size. I don't think this would be a ton of work to implement, but I think it would be good to consult with Bernie to confirm my intuition. We might then compute backgrounds over overlapping regions and then use weighted averages to construct a smooth combined background.

drlaw1558 · 2024-06-11T18:58:40Z

@t-brandt Sounds like that's sufficiently different that it makes sense to consider separately rather than tying it to the work currently here. I'll file a separate ticket to track it; https://jira.stsci.edu/browse/JP-3654

hbushouse · 2024-06-12T22:43:25Z

Another regtest run started at https://plwishmaster.stsci.edu:8081/job/RT/job/JWST-Developers-Pull-Requests/1524

hbushouse · 2024-06-13T11:29:48Z

Regtest run https://plwishmaster.stsci.edu:8081/job/RT/job/JWST-Developers-Pull-Requests/1525/ had no failures or differences, so this looks good.

Timothy Brandt added 2 commits June 10, 2024 22:00

Changed p in fit() from a diagonal matrix to a vector to speed up the…

ebc45b3

… code

Changed p in fit() from a diagonal matrix to a vector to speed up the…

4b863a1

… code

t-brandt requested a review from a team as a code owner June 11, 2024 02:19

stscijgbot-jp mentioned this pull request Jun 11, 2024

Improve runtime of NSClean step #8548

Closed

Updated CHANGES.rst to reflect improvements in NSClean.fit()

29977ab

hbushouse changed the title ~~Nsclean speedup~~ JP-3653: Nsclean speedup Jun 11, 2024

hbushouse added this to the Build 11.0 milestone Jun 11, 2024

hbushouse added NIRSPEC nsclean NIRSpec NSClean algorithm labels Jun 11, 2024

Adjusted number of dashes in CHANGES.rst

31648d7

stscijgbot-jp mentioned this pull request Jun 11, 2024

Modify NSClean behaviour for subarray data #8551

Closed

Merge branch 'master' into nsclean_speedup

52ebaf5

hbushouse approved these changes Jun 13, 2024

View reviewed changes

hbushouse merged commit 831124e into spacetelescope:master Jun 13, 2024
27 of 28 checks passed

t-brandt deleted the nsclean_speedup branch June 14, 2024 12:09

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

JP-3653: Nsclean speedup #8547

JP-3653: Nsclean speedup #8547

t-brandt commented Jun 11, 2024 •

edited by hbushouse

Loading

codecov bot commented Jun 11, 2024 •

edited

Loading

drlaw1558 commented Jun 11, 2024

t-brandt commented Jun 11, 2024

drlaw1558 commented Jun 11, 2024

hbushouse commented Jun 11, 2024

t-brandt commented Jun 11, 2024

braingram commented Jun 11, 2024

melanieclarke commented Jun 11, 2024

t-brandt commented Jun 11, 2024

drlaw1558 commented Jun 11, 2024

t-brandt commented Jun 11, 2024

drlaw1558 commented Jun 11, 2024 •

edited

Loading

hbushouse commented Jun 12, 2024

hbushouse commented Jun 13, 2024

JP-3653: Nsclean speedup #8547

JP-3653: Nsclean speedup #8547

Conversation

t-brandt commented Jun 11, 2024 • edited by hbushouse Loading

codecov bot commented Jun 11, 2024 • edited Loading

Codecov Report

drlaw1558 commented Jun 11, 2024

t-brandt commented Jun 11, 2024

drlaw1558 commented Jun 11, 2024

hbushouse commented Jun 11, 2024

t-brandt commented Jun 11, 2024

braingram commented Jun 11, 2024

melanieclarke commented Jun 11, 2024

t-brandt commented Jun 11, 2024

drlaw1558 commented Jun 11, 2024

t-brandt commented Jun 11, 2024

drlaw1558 commented Jun 11, 2024 • edited Loading

hbushouse commented Jun 12, 2024

hbushouse commented Jun 13, 2024

t-brandt commented Jun 11, 2024 •

edited by hbushouse

Loading

codecov bot commented Jun 11, 2024 •

edited

Loading

drlaw1558 commented Jun 11, 2024 •

edited

Loading